{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "b1acc12a-712e-41e6-8e30-41f9de223543",
   "metadata": {},
   "source": [
    "# Multimodal RAG with LlamaIndex"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4c557723-257f-4746-9b84-ec77c50cf405",
   "metadata": {},
   "source": [
    "This notebook shows how to perform RAG on the table, chart, and text extraction results of nv-ingest's pdf extraction tools using LlamaIndex"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c65edc4b-2084-47c9-a837-733264201802",
   "metadata": {},
   "source": [
    "**Note:** In order to run this notebook, you need to have the NV-Ingest microservice running along with all of the other included microservices. To do this, make sure all of the services are uncommented in the file: [docker-compose.yaml](https://github.com/NVIDIA/nv-ingest/blob/main/docker-compose.yaml) and follow the [quickstart guide](https://github.com/NVIDIA/nv-ingest?tab=readme-ov-file#quickstart) to start everything up. You also need to install the NV-Ingest python client installed as explained in [Step 2: Instal Python dependencies](https://github.com/NVIDIA/nv-ingest?tab=readme-ov-file#step-2-installing-python-dependencies)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "baecfda5-137b-43da-a8d4-23dd47131be9",
   "metadata": {},
   "source": [
    "To start, make sure that LlamaIndex and pymilvus are installed and up to date"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3d75fdaa-3085-4b77-9289-15276d5cd1c5",
   "metadata": {},
   "outputs": [],
   "source": [
    "pip install -qU llama_index llama-index-embeddings-nvidia llama-index-llms-nvidia llama-index-vector-stores-milvus pymilvus"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "45412661-9516-47f9-8bea-6f857e0e173f",
   "metadata": {},
   "source": [
    "Then, we'll use NV-Ingest's Ingestor interface to extract the tables and charts from a test pdf, embed them, and upload them to our Milvus vector database (VDB)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "9337daf9-7342-427a-a95e-2d9ac9b5076b",
   "metadata": {},
   "outputs": [],
   "source": [
    "from nv_ingest_client.client import Ingestor\n",
    "\n",
    "ingestor = (\n",
    "    Ingestor(message_client_hostname=\"localhost\")\n",
    "    .files(\"../data/multimodal_test.pdf\")\n",
    "    .extract(\n",
    "        extract_text=False,\n",
    "        extract_tables=True,\n",
    "        extract_images=False,\n",
    "    )\n",
    "    .embed()\n",
    "    .vdb_upload()\n",
    ")\n",
    "\n",
    "results = ingestor.ingest()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eaf6a9b8-83cf-4061-bf57-d50edc3978d0",
   "metadata": {},
   "source": [
    "Now, the text, table, and chart content is extracted and stored in the Milvus VDB along with the embeddings. Next, we'll connect LlamaIndex to Milvus and create a vector store index so that we can query our extraction results"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "a8d693a4-e647-4c20-bea4-56fe52c74541",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core import VectorStoreIndex\n",
    "from llama_index.embeddings.nvidia import NVIDIAEmbedding\n",
    "from llama_index.vector_stores.milvus import MilvusVectorStore\n",
    "\n",
    "embed_model = NVIDIAEmbedding(base_url=\"http://localhost:8012/v1\")\n",
    "\n",
    "vector_store = MilvusVectorStore(\n",
    "    uri=\"http://localhost:19530\",\n",
    "    collection_name=\"nv_ingest_collection\",\n",
    "    doc_id_field=\"pk\",\n",
    "    embedding_field=\"vector\",\n",
    "    text_key=\"text\",\n",
    "    dim=1024,\n",
    "    overwrite=False\n",
    ")\n",
    "index = VectorStoreIndex.from_vector_store(vector_store=vector_store, embed_model=embed_model)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d7e23e9d-a3ad-4b80-a356-1b233633b82d",
   "metadata": {},
   "source": [
    "Next, we'll use our vector store index to create a query engine that handles the RAG pipeline and we'll use [llama-3.1-405b-instruct](https://build.nvidia.com/meta/llama-3_1-405b-instruct) to generate the final response"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "c3eb210e-1106-4956-80a3-f950d30ac6c2",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "from llama_index.llms.nvidia import NVIDIA\n",
    "\n",
    "# TODO: Add your NVIDIA API key\n",
    "os.environ[\"NVIDIA_API_KEY\"] = \"[YOUR NVIDIA API KEY HERE]\"\n",
    "\n",
    "llm = NVIDIA(model=\"meta/llama-3.1-405b-instruct\")\n",
    "query_engine = index.as_query_engine(llm=llm)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2664333b-d30a-4846-801d-e484a18efe45",
   "metadata": {},
   "source": [
    "And finally, we can ask it questions about our example PDF"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "f397a34b-4639-4f8a-81a8-5a5a416404d5",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'The dog is chasing a squirrel in the front yard.'"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "query_engine.query(\"What is the dog doing and where?\").response"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f2979c08-56a0-441f-87bf-2c6d0c8391a0",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.16"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
